A literature-driven method to calculate similarities among diseases
Abstract
BACKGROUND "Our lives are connected by a thousand invisible threads and along these sympathetic fibers, our actions run as causes and return to us as results". This famous quote, attributed to Herman Melville, describes the connections among human lives. To paraphrase it, diseases are connected by many functional threads, and along these fibers diseases run as causes and return as results. The quote captures the motivation for studying disease-disease similarity and disease networks. Measuring similarities between diseases and constructing a disease network can play an important role in research on disease function and in disease treatment. To estimate disease-disease similarities, we propose a novel literature-based method.
METHODS AND RESULTS The proposed method extracts disease-gene relations and disease-drug relations from the literature and uses the frequencies of occurrence of these relations as features to calculate similarities among diseases. We also constructed a disease network from the top-ranking disease pairs produced by our method. The proposed method discovered a larger number of answer disease pairs than other comparable methods and achieved the lowest p-value.
CONCLUSIONS We presume that our method performed well for three reasons: it uses literature data, it uses all possible gene symbols and drug names as features of a disease, and it determines the feature values of a disease from the frequencies of co-occurrence of two entities. The disease-disease similarities obtained with the proposed method can be used in computational biology research that relies on similarities among diseases.
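The feature construction described in the abstract can be illustrated with a small sketch. The abstract does not state which similarity measure was used, so cosine similarity is an assumption here, and the disease, gene, and drug names, the toy relation list, and the helper names `build_features`, `cosine`, and `rank_pairs` are all hypothetical. The sketch only shows how relation frequencies could serve as feature values and how disease pairs could then be ranked.

```python
from collections import Counter
from itertools import combinations
from math import sqrt

# Hypothetical toy data: (disease, entity) relations as they might be
# extracted from literature. Each repeated tuple counts as one more occurrence.
relations = [
    ("disease_A", "GENE1"), ("disease_A", "GENE1"), ("disease_A", "DRUG1"),
    ("disease_B", "GENE1"), ("disease_B", "DRUG1"), ("disease_B", "DRUG1"),
    ("disease_C", "GENE2"),
]

def build_features(relations):
    """Map each disease to a Counter of gene/drug occurrence frequencies."""
    features = {}
    for disease, entity in relations:
        features.setdefault(disease, Counter())[entity] += 1
    return features

def cosine(u, v):
    """Cosine similarity between two sparse frequency vectors (an assumed measure)."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def rank_pairs(features):
    """Return all disease pairs with their similarity, highest first."""
    pairs = [
        ((a, b), cosine(features[a], features[b]))
        for a, b in combinations(sorted(features), 2)
    ]
    return sorted(pairs, key=lambda item: item[1], reverse=True)

features = build_features(relations)
ranked = rank_pairs(features)
for (a, b), score in ranked:
    print(f"{a} - {b}: {score:.3f}")
```

Under these assumptions, the top-ranking pairs from `ranked` would be the candidates for edges in a disease network, in the spirit of the network construction mentioned above.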
Journal:
- Computer methods and programs in biomedicine
Volume 122, Issue 2
Pages: -
Publication date: 2015